2024-08-09 16:26:57. AIbase
Large Models Run 5 Times Faster on Mobile Devices! Microsoft Research Asia Open-Sources New Technology for Fast Inference on CPUs
T-MAC (Table-Lookup-based MAC, where MAC stands for multiply-accumulate) addresses the memory and compute constraints of deploying large language models (LLMs) on edge devices. With model weights quantized to low-bit representations, T-MAC replaces traditional multiplication operations with lookup-table (LUT) accesses, which significantly improves inference efficiency on CPUs. The method also greatly reduces the memory required for computation, so billion-parameter LLMs can run efficiently on resource-constrained devices and bring intelligent features to them. Compared to existing implementations, T-